Generalized Marshall-Olkin exponentiated exponential distribution: Properties and applications

In this study, we propose a generalized Marshall-Olkin exponentiated exponential distribution as a submodel of the generalized Marshall-Olkin family of distributions. Some statistical properties of the proposed distribution are examined, such as its moments, moment-generating function, incomplete moments, and Lorenz and Bonferroni curves. We give five estimators for the unknown parameters of the proposed distribution based on the maximum likelihood, least squares, weighted least squares, Anderson-Darling, and Cramer-von Mises methods of estimation. To investigate the finite sample properties of the estimators, a comprehensive Monte Carlo simulation study is conducted for models with three sets of selected parameter values. Finally, four real data applications are presented to demonstrate the usefulness of the proposed distribution in practice.


Introduction
Statistical distributions are widely used to model data, including data arising in survival analysis. The exponential, Weibull, Rayleigh, gamma, and lognormal distributions have central importance in the literature as they are among the most flexible distributions used in survival analysis. However, considering the unlimited number of data generation processes, these distributions alone may be insufficient for modeling. Thus, new distributions and distribution families have been derived in recent years from existing distributions by methods including transformation, compounding, and exponentiation. The LBeta-G family [1], Kumaraswamy-G family [2], Topp-Leone G family [3], modified beta transmuted-G family [4], exponentiated Weibull distribution [5], odd Lindley half-logistic distribution [6], and extended Gumbel distribution [7] are examples of these types of derived distributions. Among others, the weighted exponential [8], Nadarajah-Haghighi exponential [9], exponentiated generalized linear exponential [10], exponentiated Weibull (EW) [11], exponentiated Weibull-Poisson (EWP) [12], extended exponential [13], α-power transformed generalized exponential [14], odd exponentiated half-logistic exponential [15], exponentiated additive Weibull [16], exponentiated Weibull-exponential [17], extended odd Weibull exponential (EOWEx) [18], and bimodal exponential [19] distributions are extensions of the exponential distribution frequently used in survival analysis. Exponentiated distributions are obtained by raising the cdf of an existing distribution to a positive power. Because they have an additional parameter, they generally fit data better than their baseline distributions. The idea of exponentiated distributions was first introduced by Lehmann [20]. The exponentiated gamma, exponentiated Weibull, exponentiated Gumbel, and exponentiated Frechet distributions are members of the class of distributions obtained by exponentiation [21].
One of the most widely used exponentiated distributions is the exponentiated exponential (EE) distribution. It was introduced by Verhulst [22] following the definition of the general form by Ahuja and Nash [23] and subsequently named by Gupta et al. [24]. The EE distribution with parameters θ and β is denoted by EE(θ, β). The probability density function (pdf) and cumulative distribution function (cdf) of this distribution are given as follows:
$$f(x;\theta,\beta) = \beta\theta\, e^{-\theta x}\left(1 - e^{-\theta x}\right)^{\beta-1} \tag{1}$$
and
$$F(x;\theta,\beta) = \left(1 - e^{-\theta x}\right)^{\beta} \tag{2}$$
where θ > 0, β > 0, and x > 0. The EE distribution is flexible in data modeling as it can have a decreasing or increasing hazard function depending on the shape parameter. The EE distribution is used in applications such as forecasting precipitation data, software reliability growth models for vital quality metrics, estimating the average life of power system equipment, and recovery rate modeling [25].
Marshall and Olkin [26] introduced a new way to add a parameter to a distribution family and proposed the Marshall-Olkin (MO) distribution family. Sankaran and Jayakumar [27] indicated that the MO family has an odds ratio function. Subsequently, Gillariose et al. [28] described the basic motivations of this distribution family as obtaining models that have more flexible skewness than symmetric distributions and acquiring heavy-tailed distributions relative to the baseline distributions. Finally, the most important motivation was said to be deriving more flexible models by providing various forms of hazard rate functions (HRFs) compared to the baseline distributions. Moreover, the MO family has an explicit interpretation with comprehensive ordering properties, including the pdf and HRF. There are many lifetime distributions in the literature obtained by means of the MO distribution family, such as the MO Frechet distribution [29], beta MO distribution family [30], MO generalized exponential distribution [31], MO alpha power distribution [32], and Weibull MO family [33].
Chesneau et al. [34] proposed a generalization of the MO family, which they called the generalized Marshall-Olkin (GMO) distribution. Notably, the resulting model is more flexible than the original MO distribution family. The pdf and cdf of the GMO distribution family are given in Eqs 3 and 4, where α, λ ∈ (0,1], and F(x) and f(x) are the cdf and pdf of the baseline distribution, respectively. When λ = 1 is substituted in Eq 4, the standard MO distribution family is obtained [26].
This study aims to propose a GMO exponentiated exponential distribution, derived from the GMO family with the EE as the baseline distribution. We denote this distribution by GMO-EE(α, λ, θ, β), with parameters α, λ, θ, and β, hereafter. There are two main reasons for choosing the EE as the baseline distribution. First, the EE is more effective than the two-parameter Weibull and two-parameter gamma distributions in data analysis. Second, as mentioned before, the EE has both increasing and decreasing HRFs [25]. The first motivation for our new model arises from the easy acquisition of the reliability function and HRF, as the cdf is quite simple. The second motivation is that its decreasing, increasing, upside-down bathtub, bathtub-shaped, constant, and increasing-decreasing-increasing HRFs can be used effectively for data modeling, especially in reliability analysis and in hydrological, biological, and engineering applications. The most important motivation for our model is that it can be used as an alternative to the Weibull, gamma, EE, and EWP models in the literature.
The remainder of the paper is organized as follows. In Section 2, the GMO-EE distribution is introduced and its survival function, HRF, quantile function, moment-generating function, and moments are obtained. Section 3 provides the maximum likelihood estimator (MLE), least squares estimator (LSE), weighted least squares estimator (WLSE), Cramer-von Mises estimator (CvME), and Anderson-Darling estimator (ADE) for the unknown parameters of the GMO-EE distribution. In Section 4, a Monte Carlo simulation study is conducted to compare the performances of the estimators in terms of biases and mean square errors (MSEs) for each parameter. In Section 5, four real data applications are presented to show the applicability of the GMO-EE model in real life. The final section concludes the study.
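As a minimal illustration of the EE baseline described above, the following Python sketch evaluates its pdf, cdf, and hazard rate, assuming the standard form F(x) = (1 - exp(-θx))^β. The function names are hypothetical and Python is used only for illustration; this is not the implementation used in the study.

```python
import math

def ee_cdf(x, theta, beta):
    """EE cdf, assumed form F(x) = (1 - exp(-theta*x))**beta for x > 0."""
    if x <= 0:
        return 0.0
    return (-math.expm1(-theta * x)) ** beta

def ee_pdf(x, theta, beta):
    """EE pdf, the derivative of ee_cdf with respect to x."""
    if x <= 0:
        return 0.0
    return beta * theta * math.exp(-theta * x) * (-math.expm1(-theta * x)) ** (beta - 1.0)

def ee_hazard(x, theta, beta):
    """Hazard rate h(x) = f(x) / (1 - F(x))."""
    return ee_pdf(x, theta, beta) / (1.0 - ee_cdf(x, theta, beta))

if __name__ == "__main__":
    # Print the hazard at a few points for three shape values.
    for beta in (0.5, 1.0, 2.0):
        print(beta, [round(ee_hazard(x, 1.0, beta), 4) for x in (0.5, 1.0, 2.0)])
```

Under this assumed form, the hazard is decreasing for β < 1, constant and equal to θ for β = 1, and increasing for β > 1, which matches the behavior attributed to the EE distribution in the text.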

GMO-EE distribution and its properties
In this section, we introduce the GMO-EE distribution as a submodel of the GMO family. Suppose that X is a random variable from the GMO-EE distribution. The pdf f(x; η) and cdf F(x; η) of X (Eqs 5 and 6) are obtained by taking the EE distribution as the baseline in the GMO family, where η = (α, λ, θ, β) is the parameter vector with α, λ ∈ [0, 1) and θ, β ∈ ℝ⁺. The survival function and HRF of the GMO-EE distribution (Eqs 7 and 8) then follow from the usual relations $S(x;\eta) = 1 - F(x;\eta)$ and $h(x;\eta) = f(x;\eta)/S(x;\eta)$.

Moment-generating function
In this subsection, the moment-generating function of the GMO-EE distribution is obtained. Let X be a random variable having distribution GMO-EE(α, λ, θ, β).

Theorem 1:
For any x and α such that $(1-\alpha)[1 - F(x)] \in (0,1)$, the pdf can be expanded as a series (Eq 9) in terms of $f_s(x) = (s+1)\,f(x)\,[F(x)]^{s}$ with coefficients $u_l$. The moment-generating function of the GMO-EE distribution is obtained via Theorem 1 (Eq 10), and the integrals it involves are given in Eqs 11 and 12. By using the transformation u = exp(-θx) in Eqs 11 and 12, Eqs 13 and 14 are respectively obtained, where $\Gamma(a) = \int_0^\infty x^{a-1}\exp(-x)\,dx$ is the gamma function. Accordingly, the moment-generating function is expressed as in Eq 15.
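To illustrate the transformation step, assume the EE baseline in the standard form F(x) = (1 - exp(-θx))^β. Then f_s(x) = (s + 1) f(x) [F(x)]^s is itself an EE density with shape b = β(s + 1), and the substitution u = exp(-θx) reduces the integral of exp(tx) f_s(x) over x > 0 to b·B(1 - t/θ, b) for t < θ, where B is the beta function. The Python sketch below compares this closed form with a direct numerical integration. It is a check under these assumptions and is not claimed to reproduce Eqs 11-15 term by term; all function names are hypothetical.

```python
import math

def beta_fn(a, b):
    """Beta function expressed through the gamma function."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def mgf_fs_closed(t, theta, beta, s):
    """Closed form of the integral of exp(t*x)*f_s(x) over x > 0 (requires t < theta)."""
    b = beta * (s + 1)
    return b * beta_fn(1.0 - t / theta, b)

def mgf_fs_numeric(t, theta, beta, s, upper=60.0, n=200_000):
    """Midpoint-rule approximation of the same integral in the original variable x.

    The truncation point `upper` is adequate only when exp((t - theta)*upper) is negligible.
    """
    b = beta * (s + 1)
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        e = math.exp(-theta * x)
        total += math.exp(t * x) * b * theta * e * (1.0 - e) ** (b - 1.0)
    return total * h

if __name__ == "__main__":
    args = (-0.5, 1.0, 1.5, 1)  # t, theta, beta, s
    print(mgf_fs_closed(*args), mgf_fs_numeric(*args))
```

At t = 0 the closed form equals 1, as expected for a density.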

Lemma 1:
The binomial series expansion for any n > 0 is given in Eq 16.

Let us now consider the rth moment of the GMO-EE distribution with parameters α, λ, θ, and β. The rth moment is expressed in Eq 17 in terms of integrals of the form $M^{*}_{r,0} = \int_0^\infty x^{r} f_s(x)\,dx$, which are given in Eqs 18 and 19. Using the series expansion in Eq 16, the integrals in Eqs 18 and 19 are computed as in Eqs 20 and 21. Thus, via Eqs 20 and 21, the rth moment is obtained in Eq 22. Setting r = 1 and r = 2 in Eq 22 gives the first and second moments, respectively. The rth incomplete moment of a random variable X having distribution GMO-EE(η) is defined as $m_r(t) = \int_0^{t} x^{r} f(x;\eta)\,dx$.
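The role of the binomial series in the moment integrals can be sketched as follows, again assuming the EE baseline in the standard form so that f_s(x) = bθ exp(-θx)(1 - exp(-θx))^(b-1) with b = β(s + 1). Expanding (1 - z)^(b-1) = Σ_j (-1)^j C(b-1, j) z^j with z = exp(-θx) and integrating term by term gives the rth moment of f_s as b Γ(r+1) θ^(-r) Σ_j (-1)^j C(b-1, j) / (j+1)^(r+1). The Python function below (a hypothetical helper, not taken from the paper) evaluates this series; the particular form of the expansion used in Eq 16 is an assumption here.

```python
import math

def fs_moment_series(r, theta, b, terms=5000):
    """r-th moment of b*theta*exp(-theta*x)*(1 - exp(-theta*x))**(b - 1), x > 0.

    Uses the generalized binomial series for (1 - z)**(b - 1) and integrates
    each term x**r * exp(-theta*(j + 1)*x) in closed form.
    """
    coeff = 1.0  # a_0 = (-1)**0 * C(b - 1, 0)
    total = 0.0
    for j in range(terms):
        total += coeff / (j + 1) ** (r + 1)
        # Recurrence a_{j+1} = a_j * (j + 1 - b) / (j + 1)
        coeff *= (j + 1 - b) / (j + 1)
        if coeff == 0.0:  # the series terminates when b is a positive integer
            break
    return b * math.gamma(r + 1) / theta ** r * total

if __name__ == "__main__":
    print(fs_moment_series(1, 1.0, 3.0))  # first moment for b = 3, theta = 1
```

For integer b the series is finite; for example, with b = 3 and θ = 1 the first moment equals 1 + 1/2 + 1/3, and b = 1 recovers the exponential moments Γ(r+1)/θ^r.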

Bonferroni and Lorenz curves
The Bonferroni and Lorenz curves are widely used tools for analyzing data in economics and reliability. The Bonferroni and Lorenz curves for the GMO-EE(η) distribution are respectively given by the following:
$$B(p) = \frac{1}{p\,\mu}\int_0^{q} x\, f(x;\eta)\,dx$$
and
$$L(p) = \frac{1}{\mu}\int_0^{q} x\, f(x;\eta)\,dx$$
where μ is the first moment and q = Q(p) denotes the quantile function.
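A numerical sketch of the two curves is given below, using the definitions L(p) = (1/μ)∫_0^q x f(x) dx and B(p) = L(p)/p with q = Q(p). Because the closed-form GMO-EE expressions are not reproduced here, the EE baseline (standard form, an assumption) is used as a stand-in, and the integrals are approximated by a simple midpoint rule. All function names are hypothetical.

```python
import math

def ee_pdf(x, theta, beta):
    """EE pdf in the assumed standard form."""
    return beta * theta * math.exp(-theta * x) * (-math.expm1(-theta * x)) ** (beta - 1.0)

def ee_quantile(p, theta, beta):
    """Inverse of F(x) = (1 - exp(-theta*x))**beta."""
    return -math.log1p(-p ** (1.0 / beta)) / theta

def partial_mean(upper, theta, beta, n=100_000):
    """Midpoint-rule approximation of the integral of x*f(x) over (0, upper)."""
    h = upper / n
    return h * sum((k + 0.5) * h * ee_pdf((k + 0.5) * h, theta, beta) for k in range(n))

def lorenz_bonferroni(p, theta, beta):
    """Return (L(p), B(p)); the mean is approximated by truncating at a far quantile."""
    mu = partial_mean(ee_quantile(1.0 - 1e-12, theta, beta), theta, beta)
    lorenz = partial_mean(ee_quantile(p, theta, beta), theta, beta) / mu
    return lorenz, lorenz / p

if __name__ == "__main__":
    print(lorenz_bonferroni(0.5, 1.0, 1.0))
```

For the exponential special case (β = 1), the Lorenz curve has the known form L(p) = p + (1 - p) log(1 - p), which gives a convenient check of the numerical routine.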

Quantile function
The quantile function Q(p) of the GMO-EE distribution, for p ∈ (0,1), is obtained by solving F(Q(p); η) = p for Q(p), where F(x; η) is the GMO-EE cdf.
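When a closed-form quantile function is not at hand, the equation F(x; η) = p can also be solved numerically. The Python sketch below does this by bisection for any increasing cdf on x > 0 and uses the result for inverse-transform sampling. The cdf is passed in as a callable; in the usage example the EE cdf (assumed standard form) stands in for the GMO-EE cdf. Function names are hypothetical.

```python
import math
import random

def quantile_by_bisection(cdf, p, tol=1e-10):
    """Solve cdf(x) = p on x > 0 for an increasing cdf by bracketing and bisection."""
    lo, hi = 0.0, 1.0
    while cdf(hi) < p:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample(cdf, n, rng):
    """Inverse-transform sampling: apply the numerical quantile to uniform draws."""
    return [quantile_by_bisection(cdf, rng.random()) for _ in range(n)]

if __name__ == "__main__":
    theta, beta = 2.0, 0.75
    ee_cdf = lambda x: (-math.expm1(-theta * x)) ** beta
    print(quantile_by_bisection(ee_cdf, 0.3))
    print(sample(ee_cdf, 3, random.Random(0)))
```

Bisection is slower than a closed-form inverse but requires only that the cdf can be evaluated, which makes it a convenient fallback for simulation.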

Maximum likelihood estimation method
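Let $X_1, X_2, \ldots, X_n$ be a random sample from the GMO-EE(η) distribution with observed values $x_1, x_2, \ldots, x_n$. The log-likelihood function is given by the following:
$$\ell(\eta) = \sum_{i=1}^{n} \log f(x_i;\eta)$$
where f(x; η) is the GMO-EE pdf. Accordingly, the MLEs for the unknown parameters of η = (α, λ, θ, β) are obtained by solving the following equations:
$$\frac{\partial \ell(\eta)}{\partial \alpha} = 0, \quad \frac{\partial \ell(\eta)}{\partial \lambda} = 0, \quad \frac{\partial \ell(\eta)}{\partial \theta} = 0, \quad \frac{\partial \ell(\eta)}{\partial \beta} = 0.$$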

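Least squares and weighted least squares estimation
Let $X_1, X_2, \ldots, X_n$ be a random sample taken from the GMO-EE(η) distribution and let $X_{1:n}, X_{2:n}, \ldots, X_{n:n}$ be the order statistics of this sample. The expected value and variance of the cdf evaluated at the ith order statistic are as follows:
$$E\left[F(X_{i:n})\right] = \frac{i}{n+1}, \qquad \mathrm{Var}\left[F(X_{i:n})\right] = \frac{i\,(n-i+1)}{(n+1)^{2}(n+2)}.$$
Hence, the LSEs for the unknown parameters of the GMO-EE(α, λ, θ, β) distribution are obtained by minimizing the objective function given as follows:
$$S_{LS}(\eta) = \sum_{i=1}^{n}\left[F(x_{i:n};\eta) - \frac{i}{n+1}\right]^{2}.$$
Thus, the LSEs of the parameters, $\hat{\alpha}_{LSE}$, $\hat{\lambda}_{LSE}$, $\hat{\theta}_{LSE}$, and $\hat{\beta}_{LSE}$, are obtained by solving the system of equations given by Eqs 35-38:
$$\sum_{i=1}^{n}\left[F(x_{i:n};\eta) - \frac{i}{n+1}\right]\frac{\partial F(x_{i:n};\eta)}{\partial \psi} = 0, \qquad \psi \in \{\alpha, \lambda, \theta, \beta\}.$$
The WLSEs of the parameters α, λ, θ, and β, $\hat{\alpha}_{WLSE}$, $\hat{\lambda}_{WLSE}$, $\hat{\theta}_{WLSE}$, and $\hat{\beta}_{WLSE}$, are obtained by minimizing the objective function given by the following:
$$S_{WLS}(\eta) = \sum_{i=1}^{n} w_i\left[F(x_{i:n};\eta) - \frac{i}{n+1}\right]^{2}$$
where $w_i = (n+1)^{2}(n+2)/\left(i\,(n-i+1)\right)$ is the reciprocal of the variance given above.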
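The two objective functions can be written compactly in code. The Python sketch below takes the model cdf as a callable, so it applies to any fitted distribution; the weights are taken as the reciprocal of the variance of the cdf at the ith order statistic, i.e. (n+1)²(n+2)/(i(n-i+1)). Function names are hypothetical, and the minimization itself (which in practice requires a numerical optimizer) is not shown.

```python
def lse_objective(cdf, data):
    """Sum of squared differences between cdf(x_(i)) and i/(n+1)."""
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def wlse_objective(cdf, data):
    """Weighted version with w_i = (n+1)^2 (n+2) / (i (n - i + 1))."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        w = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1))
        total += w * (cdf(x) - i / (n + 1)) ** 2
    return total

if __name__ == "__main__":
    uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
    print(lse_objective(uniform_cdf, [0.2, 0.5, 0.75]))
    print(wlse_objective(uniform_cdf, [0.2, 0.5, 0.75]))
```

The weights are largest for the smallest and largest order statistics, where the variance of the cdf value is smallest, so the weighted criterion penalizes misfit in the tails more heavily than the unweighted one.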

Anderson-Darling and Cramer-von Mises estimation
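The Anderson-Darling method is based on the Anderson-Darling goodness-of-fit statistic proposed by Anderson and Darling [36]. Accordingly, the ADEs for the unknown parameters of the GMO-EE distribution are obtained by minimizing the following objective function:
$$A(\eta) = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left\{\log F(x_{i:n};\eta) + \log\left[1 - F(x_{n+1-i:n};\eta)\right]\right\}.$$
The Cramer-von Mises method, like the LSE and WLSE, is based on the discrepancy between the cdf and the empirical distribution function. Thus, the Cramer-von Mises estimators $\hat{\alpha}_{CvME}$, $\hat{\lambda}_{CvME}$, $\hat{\theta}_{CvME}$, and $\hat{\beta}_{CvME}$ can be obtained by minimizing the following:
$$C(\eta) = \frac{1}{12n} + \sum_{i=1}^{n}\left[F(x_{i:n};\eta) - \frac{2i-1}{2n}\right]^{2}.$$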
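A minimal Python sketch of the Anderson-Darling and Cramer-von Mises criteria in their standard forms is given below. As before, the model cdf is supplied as a callable and the function names are hypothetical; the sketch only evaluates the criteria and leaves the minimization over the parameters to a numerical optimizer.

```python
import math

def ad_objective(cdf, data):
    """Anderson-Darling criterion:
    -n - (1/n) * sum_i (2i - 1) * [log F(x_(i)) + log(1 - F(x_(n+1-i)))]."""
    xs = sorted(data)
    n = len(xs)
    s = 0.0
    for i in range(1, n + 1):
        # xs[i - 1] is x_(i); xs[n - i] is x_(n+1-i) in zero-based indexing
        s += (2 * i - 1) * (math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i])))
    return -n - s / n

def cvm_objective(cdf, data):
    """Cramer-von Mises criterion: 1/(12n) + sum_i [F(x_(i)) - (2i - 1)/(2n)]^2."""
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (cdf(x) - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(xs, start=1)
    )

if __name__ == "__main__":
    uniform_cdf = lambda x: x
    print(ad_objective(uniform_cdf, [0.25, 0.75]))
    print(cvm_objective(uniform_cdf, [0.25, 0.75]))
```

Both criteria require cdf values strictly between 0 and 1 at the data points in the Anderson-Darling case, since the logarithms are otherwise undefined.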

Simulation study
In this section, the performances of the MLE, LSE, WLSE, ADE, and CvME of the unknown parameters of the GMO-EE distribution are assessed according to mean biases and MSEs. The data are randomly generated from three GMO-EE models with selected parameter vectors η1 = (0.4, 0.8, 1, 0.5), η2 = (0.5, 0.7, 3, 0.9), and η3 = (0.2, 0.5, 2, 0.75). The mean biases and MSEs are obtained in 5000 replications with sample sizes of 50, 100, 150, 200, 250, 500, 750, and 1000 for each of the models. Because the estimators of the parameters cannot be obtained in closed form, numerical optimization algorithms available in R software [38], such as BFGS, Nelder-Mead, CG, and L-BFGS-B [37], can be used to obtain the estimates of the parameters. The mean biases and MSEs in terms of the sample sizes and the five aforementioned estimators for the three models with η1, η2, and η3 are given in Tables 1-3, respectively. When Tables 1-3 are examined, it is seen that the mean biases and MSEs decrease steadily as the sample size increases. In addition, with the increase in sample size, the mean
Table 1. Mean bias and MSE for the model with α = 0.4, λ = 0.8, θ = 1, and β = 0.5
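The structure of such a simulation can be sketched in a few lines. The Python example below is a deliberately simplified stand-in and not the study's R code: it draws data from the EE baseline (assumed standard form) rather than the GMO-EE, treats θ as known, and uses the closed-form MLE of β that is available in that case, β̂ = -n / Σ log(1 - exp(-θ x_i)). It then reports the mean bias and MSE over replications. All names, the seed, and the replication count are illustrative choices.

```python
import math
import random

def ee_sample(n, theta, beta, rng):
    """Inverse-transform draws from F(x) = (1 - exp(-theta*x))**beta."""
    out = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)  # keep u strictly inside (0, 1)
        out.append(-math.log1p(-u ** (1.0 / beta)) / theta)
    return out

def beta_mle_known_theta(xs, theta):
    """Closed-form MLE of beta for the EE model when theta is known."""
    return -len(xs) / sum(math.log(-math.expm1(-theta * x)) for x in xs)

def bias_and_mse(n, theta, beta, reps, rng):
    """Mean bias and MSE of the estimator over `reps` simulated samples of size n."""
    est = [beta_mle_known_theta(ee_sample(n, theta, beta, rng), theta) for _ in range(reps)]
    bias = sum(e - beta for e in est) / reps
    mse = sum((e - beta) ** 2 for e in est) / reps
    return bias, mse

if __name__ == "__main__":
    rng = random.Random(1)
    for n in (25, 50, 100, 200):
        print(n, bias_and_mse(n, 1.0, 0.5, 500, rng))
```

In this simplified setting the bias and MSE shrink as n grows, which is the same qualitative pattern described for the five estimators of the GMO-EE parameters.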

Real data analysis
In this section, four real data applications are presented to compare the fits of the GMO-EE distribution and competing distributions. For this purpose, some comparative statistics such as the Cramer-von Mises (CvM), Kolmogorov-Smirnov (K-S), and Anderson-Darling (AD) test statistics with their p-values are applied for the four datasets together with Akaike's information criterion (AIC) and -2 × log-likelihood values.
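Two of the comparison measures are simple enough to sketch directly. The Python functions below compute AIC from a maximized log-likelihood and the number of parameters k, and the Kolmogorov-Smirnov statistic for a fitted cdf supplied as a callable. The names are hypothetical, and the p-values reported in the tables are not computed here.

```python
def aic(log_likelihood, k):
    """Akaike's information criterion: 2k - 2 * (maximized log-likelihood)."""
    return 2 * k - 2 * log_likelihood

def ks_statistic(cdf, data):
    """Largest distance between the fitted cdf and the empirical distribution function."""
    xs = sorted(data)
    n = len(xs)
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n) for i, x in enumerate(xs, start=1)
    )

if __name__ == "__main__":
    print(aic(-100.0, 4))
    print(ks_statistic(lambda x: x, [0.25, 0.75]))
```

Smaller values of both measures indicate a better fit; AIC additionally penalizes the number of parameters, which matters when a four-parameter model such as the GMO-EE is compared with one-, two-, and three-parameter competitors.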

Dataset 1
The first dataset consists of the number of successive failures of the air conditioning systems of the 13 members of a fleet of Boeing 720 jet airplanes. This dataset has been used in previous studies [39,40]. We analyze this dataset to compare the GMO-EE with the EE [24], Weibull (W), MOEBXII [41], generalized binomial exponential-II (GBE-II) [42], EOWEx [18], EWP [12], and exponential distributions. MLEs with standard errors for the unknown parameters of the fitted distributions and the comparative statistics are given in Tables 4 and 5, respectively. The cdf plots of the fitted distributions are shown in Fig 3. As seen from Table 5, the GMO-EE distribution outperforms the one-parameter exponential, two-parameter W and EE, and three-parameter MOEBXII and GBE-II distributions. Satisfactory and comparable model fits are provided by the three-parameter EOWEx and four-parameter EWP, while the best results are obtained by the GMO-EE except for the smaller AIC value of the EOWEx.

Dataset 3
The third real dataset contains the exceedances of flood peaks (in m³/s) of the Wheaton River near Carcross in Yukon Territory, Canada. These data were used in previous studies [45,46]. We analyze this dataset to compare the GMO-EE with the EE [24], W, MOEBXII [41], GBE-II [42], EOWEx [18], EWP [12], and exponential distributions. The analysis results are given in Tables 8 and 9, and cdf plots of the fitted distributions are shown in Fig 5. As seen from Table 9, the GMO-EE is the best-fitting model according to all selection criteria.

Conclusion
In this study, we have introduced the GMO-EE distribution with parameters (α, λ, θ, β) as a submodel of the GMO distribution family. We have obtained some statistical properties of the new model, such as the moment-generating function, moments, incomplete moments, and Lorenz and Bonferroni curves. Since the GMO-EE distribution has hazard rate functions that are upside-down bathtub, bathtub-shaped, increasing, decreasing, constant, and increasing-decreasing-increasing, as depicted in Fig 2 for different parameter values, it can be regarded as a flexible distribution for modeling. Moreover, we have provided five different estimation methods for the unknown parameters of the GMO-EE distribution and conducted a Monte Carlo simulation study to evaluate the performances of the estimators. According to the simulation results, the mean biases and MSEs decrease progressively as sample sizes increase. Four real datasets were fitted to the GMO-EE and some competing distributions to compare them in terms of model fits. The GMO-EE was found to be the best fitted model according to the -2 × log-likelihood, AD with its p(AD), CvM with its p(CvM), and K-S with its p(K-S) criteria among   [48] reported an average recovery time of approximately 15 days for another sample of male COVID-19 patients over the age of 60 years (n = 582) and Barman et al. [49] obtained a 95% confidence interval for the average recovery time of 16 to 34 days based on another sample of COVID-19 patients (n = 221). The probability of recovery within 2 weeks was calculated as 44.62% based on the GMO-EE distribution, while Tanış [47] found it to be 45.25%. The results obtained from the GMO-EE distribution are thus supported by the findings of previous studies. The satisfactory results of these real data applications support the applicability of the GMO-EE model in practice.
In light of our results, we anticipate that the proposed model can be used to fit data obtained from a broad range of fields including survival analysis, meteorology, economics, biology, hydrology, and other applications in life sciences and engineering. Although there are more parsimonious models in the literature, the GMO-EE may still be used effectively thanks to its upside-down bathtub and bathtub-shaped HRFs for modeling biological, clinical, and mortality data in particular. Moreover, the proposed model can be considered as an alternative to the extensions of exponential and Weibull distributions. Further studies based on the GMO-EE distribution could address topics such as parameter estimation of censored data and lifetime regression.